Abstract

This paper discusses the role of the opportunistic punisher, who may act
selfishly to free-ride on cooperators or to avoid being exploited by
defectors. To consider the opportunistic punisher, we change the sequence
of the one-shot public good game: instead of choosing actions first and
punishing afterwards, players declare their commitment to punish first and
then choose their actions. In this commitment-first setting, a punisher
may use information about her team and may defect to increase her fitness
in the team. Reversing the sequence of the public good game can induce
behavior of the punisher that cannot be considered in the standard setting,
where the punisher always chooses cooperation. Based on stochastic
dynamics developed by evolutionary economists and biologists, we show that
the opportunistic punisher can make cooperation evolve where the
cooperative punisher fails. This alternative route for the evolution of
cooperation relies, paradoxically, on the players' selfishness in profiting
from others' unconditional cooperation and defection.
1 Introduction



The public good game (PGG) has been one of the most active research themes
in economics and evolutionary biology for the last ten years. In typical
PGG experiments, G individuals have the opportunity to cooperate and
contribute a fixed amount q to a common resource, or to defect and
contribute nothing. The total amount in the common resource is multiplied
by a factor r and distributed equally among the members without regard to
their contributions. The average return on a unit of investment is qr/G,
where G is the size of a team. If r < G, a rational player does not
contribute, and the Nash equilibrium is universal defection. But if all
members cooperate, each receives (r − 1)q, which is larger than the payoff
from universal defection.

This is a classic social dilemma in which the Nash equilibrium differs from
the social optimum. As in the case of the Prisoner's Dilemma, academic
interest in PGG has focused on the mechanisms that make the evolution of
cooperation possible. For this purpose, besides introducing reputation
effects through repeated games, two theoretical methods have been
proposed: i) stern punishment that is unrelated to payoff considerations,
and ii) the option for players to exit from the game.



For the first direction, Fehr and Gächter (2000) provide experimental
evidence by conducting a two-stage version of PGG. 1 Obviously, the
sub-game perfect equilibrium is that agents never punish in stage two
because it lowers their payoff; hence punishment is not a factor in
decisions in stage one, and there are no contributions in stage one, as
usual. But ample experimental studies consistently show that the
availability of a punishment mechanism increases contributions markedly
relative to its absence. Further, some punishment does actually occur.

This influential work has been followed by numerous studies that explore
punishment, and this implies that punishment in PGG is a key part of the
institutional and behavioral mechanisms that overcome the social dilemma of
cooperation (Ledyard, 1997; Sigmund, 2007). But a big piece of the puzzle
is that punishment itself cannot be evolutionarily favored, because this
behavior cannot earn a higher payoff or fitness than defection. Imagine
that the whole population consists of defectors. When a mutant punisher
appears, her payoff cannot exceed that of the other defectors as long as
punishment is sufficiently costly. Even though the role of punishment in
the evolution of cooperation in PGG may be reasonably accepted, this
behavior may not be selected and may not survive in the evolutionary
process.

1 In stage one, four subjects play a simple PGG. In stage two, the
contributions of individual members are revealed, and any member of the
four-player group may choose to reduce the earnings of any other member of
the group at a cost to himself.

Another direction of research based on evolutionary dynamics examines the
power of exit options that let players avoid the worst outcomes in PGG. The
idea is that defection in PGG can be circumvented by letting players choose
an option whose value lies between universal defection and high-frequency
cooperation. Brandt et al. (2003) show, using replicator dynamics, that the
exit option creates an evolutionary cycle among cooperation, defection and
exit. According to their conclusion, however, this evolutionary cycle
cannot help the evolution of cooperation, because the three strategies earn
the same payoff or fitness, which means that participation in PGG is no
better than the exit option.



Hauert et al. (2007) propose that the two directions may be interwoven in
the evolution of cooperation. Their intuition is that defectors may break
up a homogeneous population of cooperators, but the equilibrium based on
universal defection can also be shaken by the exit option. When the whole
population chooses to exit, cooperation, or cooperation with punishment, is
the better choice for players. If we focus on homogeneous states, these
four states form evolutionary cycles. Based on evolutionary dynamics,
Hauert et al. (2007) show that the cooperative state can be the dominant
state, which is a route for the evolution of cooperation in one-shot PGG.
Building on former studies, this paper considers an unexplored theoretical
element of PGG using stochastic evolutionary dynamics. We investigate the
role and effectiveness of punishment in PGG by assuming a slightly
different setting of PGG and a different behavioral pattern of the
punisher. To this end, the sequence of the standard PGG, in which players
choose strategies first and punish with information on players' actions, is
reversed. That is, players commit to punishment first and choose their
actions with this information. We also introduce the opportunistic
punisher, who chooses her action based on the number of punishers in her
team. Hence, for this new type of punisher, the information about punishing
commitments plays a key role in choosing actions. Even though the
commitment to punish can hurt the punisher herself if she chooses
defection, this choice may pay when there is a sufficiently large number of
cooperators. Intuitively, unlike in Hauert et al. (2007), where the
punisher originates from the cooperator, our punisher is opportunistic in
that she deviates from the cooperative strategy when it pays.

The paper is organized as follows. Section 2 succinctly describes the
basics of our PGG and the methodology we rely on, stochastic (adaptive)
dynamics. Section 3 presents numerical cases for our main discussion.
Section 4 presents a theoretical extension of this paper based on the Fermi
function. Section 5 gives concluding remarks.



2 Setup and Method 

PGG, Nash equilibrium and Punishment 

This paper is based on a G-person game called the Public Good Game. We
consider a well-mixed population of constant size M > 2, from which G
individuals are randomly selected and offered the option to participate in
PGG. Each must decide whether to contribute to the public good or not:
cooperate (C) or defect (D). For simplicity, players invest a fixed amount
c. We assume that the contributions of all G_C cooperators are multiplied
by r > 1 and then divided among all G players participating in the game.
The payoffs for D and C are given by

\[
\pi_D = \frac{r c\, G_C}{G} \quad \text{for D}, \qquad
\pi_C = \frac{r c\, G_C}{G} - c \quad \text{for C}.
\]



The Nash equilibrium follows easily from the benefit generated by switching
from C to D, which is c(1 − r/G). Obviously, players play D as long as
r < G. This is the social dilemma that resembles the Prisoner's Dilemma.
Most of the interest in evolutionary game theory lies in finding routes or
mechanisms to overcome this uncooperative state. Can this social dilemma be
evaded through positive or negative devices specifically directed towards
individual players? In this paper, we focus on negative and neutral
mechanisms: punishment and exit. 2
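The switching benefit above can be checked numerically. The following is a minimal sketch, assuming the standard linear PGG payoffs (every group member receives r·c·G_C/G and a cooperator additionally pays c); the function names are illustrative and are not taken from the paper.

```python
def pgg_payoffs(n_cooperators: int, group_size: int, r: float, c: float = 1.0):
    """Return (payoff_C, payoff_D) in a group with n_cooperators contributors."""
    share = r * c * n_cooperators / group_size  # equal share of the multiplied pot
    return share - c, share


def gain_from_defecting(group_size: int, r: float, c: float = 1.0) -> float:
    """Payoff change when one cooperator switches to defection: c * (1 - r / G)."""
    return c * (1.0 - r / group_size)


if __name__ == "__main__":
    G, r = 5, 3.0
    for k in range(1, G + 1):
        pay_c, _ = pgg_payoffs(k, G, r)      # focal player cooperates (k cooperators)
        _, pay_d = pgg_payoffs(k - 1, G, r)  # focal player defects instead
        # The realized gain matches c * (1 - r / G) for every k.
        print(k, round(pay_d - pay_c, 6), gain_from_defecting(G, r))
```

With G = 5 and r = 3 the gain from defecting does not depend on how many other members cooperate, which is why D dominates whenever r < G.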

When a player quits, σ is her payoff. We call her the Loner (L). When she
participates, her type determines her act in a team. The cooperator (C)
contributes an amount c = 1 to the team, and the defector (D) does not
contribute but free-rides on the other C in her team. After this
first-round interaction, each team member can impose a fine β upon each
target at a personal cost γ for each fine. The punisher (P) engages in this
costly behavior against her own benefit. For the following discussion, the
basic parameters are summarized in Table 1.

To consider stochastic dynamics in finite populations, the groups engaging
in a public goods game are formed by multivariate hypergeometric sampling.
This sampling affects the payoffs of each interaction between two types.
The resulting payoffs, fixation probabilities and limiting distribution are
given in Appendix B.
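Group formation by multivariate hypergeometric sampling amounts to drawing G players without replacement from the finite population. The sketch below illustrates this with the standard library only; the population composition and the function name `sample_group` are hypothetical and serve only as an example.

```python
import random
from collections import Counter


def sample_group(population: dict, group_size: int, rng: random.Random) -> Counter:
    """Draw `group_size` players without replacement and count them by type."""
    pool = [kind for kind, count in population.items() for _ in range(count)]
    if group_size > len(pool):
        raise ValueError("group is larger than the population")
    return Counter(rng.sample(pool, group_size))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, only for reproducibility of the sketch
    population = {"C": 40, "D": 30, "L": 20, "P": 10}  # hypothetical, M = 100
    for _ in range(3):
        print(dict(sample_group(population, 5, rng)))
```

Because players are drawn without replacement, the composition of a group depends on the finite population size M, which is what distinguishes this setting from the infinite-population replicator approach.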

2 Recently, some experimental evidence has shown that positive devices can
be more effective in inducing cooperation among participants. In this
paper, we stay with the punishment issue, which has been discussed more
intensively.




Parameter   Description

M           The size of the total population
G           The size of the PGG group
r           The multiplier of PGG
β           The amount of punishment on a target per punishment
γ           The cost of punishment incurred per punishment
σ           The payoff of the loner leaving a team

Table 1: Parameters of PGG.



Stochastic (adaptive) dynamics 

While replicator dynamics provide numerous crucial insights, they are
fundamentally deterministic dynamics in an arbitrarily large, sometimes
infinite, population. Theoretical approaches to overcome this limitation
have long been considered in various fields such as theoretical ecology,
economics and sociology. This paper focuses on a concept developed by
economists and evolutionary biologists: stochastic (adaptive) dynamics of
finite populations.

In evolutionary game theory, stochastic (adaptive) dynamics were introduced
to understand long-run behavior, which may differ fundamentally from the
behavior of replicator dynamics, the deterministic process obtained by the
law of large numbers. In replicator dynamics, a state is locally
asymptotically stable if any sufficiently small deviation from the original
state vanishes. Young (1993) criticizes this approach because it treats
shocks as if they were isolated events. Considering that an economic system
is constantly perturbed from various sources, this assumption of
arbitrarily small shocks is unsatisfactory.

In particular, persistent shocks can accumulate and tip the process out of
the basin of attraction of an asymptotically stable state. Thus, when
shocks are persistent, the generally accepted equilibrium concept,
evolutionarily stable strategies, cannot be used to explain the long-term
behavior of an economic system. Stochastic dynamics, in contrast, can
predict the probability of staying in different equilibria independently of
the initial conditions. The persistent shocks act as a selection mechanism,
and the selection becomes sharper the less likely the shocks are. The
long-run distribution relies on the probabilities of escaping from the
various states, and these are exponential functions of the error rate. This
idea was first formalized by Freidlin and Wentzell (1998). 3

Stochastic stability was applied to the problem of equilibrium selection in
games by Kandori et al. (1993) and Young (1993). But these economic
applications are based on "order-of-magnitude" comparisons of the
transitions between the various recurrent classes of the no-mutation
process (Ellison, 2000). By this method, the state that is perturbed least
by the adaptive dynamics is selected as the long-term equilibrium, as the
mutation rate is made as small as necessary.



Taylor et al. (2004) analyze a similar but different version of the
stochastic no-mutation process, where a single mutation can lead to a
transition from one absorbing state to another. In this theory, the
equilibrium depends on the "expected speed of flow" at every absorbing
state. This assumes that a single mutant of another type can escape each
absorbing state, and the fate of this mutant is determined by the fixation
probability between the two underlying types. Also, Fudenberg and Imhof
(2006) show that there exists a sufficiently small mutation rate at which
no two mutant types coexist. That is, the fate of a mutant, its elimination
or fixation, is settled before the next mutant appears. Thus the
transitions between homogeneous states occur when a mutant appears and
spreads to fixation.

The advantage of this model is that the transition matrix can be formulated
neatly as a Markov chain whose state space consists of the homogeneous
states and whose entries are the fixation probabilities of each state
against one another. For this Markov transition matrix, a unique stationary
vector can be calculated, which is interpreted as the invariant
distribution of the underlying stochastic process. Compared with Kandori et
al. (1993) and Young (1993), this method shows the relative share of time
that each homogeneous state spends with respect to its competitors.
Appendix A summarizes the method of Fudenberg and Imhof (2006).

3 Their idea is that a small mutation term gives the system a different
stability for each state, so that the limit of the invariant distribution
can be derived as the mutation probability goes to zero (Ren and Zhang,
2008).
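The invariant distribution over homogeneous states can be sketched as follows. This is an illustration under stated assumptions, not the paper's Mathematica code: it assumes that in an all-i population each of the other n − 1 types is equally likely to appear as the next mutant, so the transition probability from state i to state j is the fixation probability divided by n − 1. The fixation probabilities in the example are hypothetical numbers chosen to mimic a cyclic pattern.

```python
def transition_matrix(fix):
    """fix[i][j]: probability that a single j-mutant takes over an all-i population."""
    n = len(fix)
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                T[i][j] = fix[i][j] / (n - 1)  # mutant type j appears with prob 1/(n-1)
        T[i][i] = 1.0 - sum(T[i])              # otherwise the state stays homogeneous in i
    return T


def stationary_distribution(T, iterations=100_000, tol=1e-13):
    """Left stationary vector of T by power iteration (pure Python)."""
    n = len(T)
    x = [1.0 / n] * n
    for _ in range(iterations):
        y = [sum(x[i] * T[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x


if __name__ == "__main__":
    types = ["C", "D", "L"]
    # Hypothetical fixation probabilities for a cycle C -> D -> L -> C.
    fix = [
        [0.0, 0.10, 0.001],   # all-C: D invades easily, L hardly
        [0.001, 0.0, 0.25],   # all-D: L invades easily
        [0.20, 0.001, 0.0],   # all-L: C invades easily
    ]
    dist = stationary_distribution(transition_matrix(fix))
    for name, share in zip(types, dist):
        print(name, round(share, 3))
```

The entries of the resulting vector play the role of the relative staying frequencies reported under each type in the figures.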



3 Cooperative vs. Opportunistic Punishment 

To consider a different setting and role of the punisher in PGG, we look at
typical numerical examples under stochastic dynamics. The results in this
section are based on Hauert et al. (2007) (Appendix B provides a short
description of the game and payoff functions).
Sequence of one-shot PGG 



Most research, including Hauert et al. (2007), has assumed a one-shot PGG
where players decide on a strategy or action first and punish accordingly
if some of them want to. In this standard setup, the punisher originates
from C, and punishing behavior depends on the information about players'
action choices. What if this sequence is reversed? Players commit to
punishment first and choose their actions later. For simplicity, commitment
is assumed to be always credible, and the number of commitments is
announced to the participants of a team. For C, D and L, who do not care
about punishing, this reversed sequence may not affect their actions. For
P, however, this information may be crucial in that it conveys information
about her own type. Thus, P can choose her action depending on this
information.

By reversing the sequence of PGG, we can discern two types of punisher: the
cooperative punisher (CP) and the opportunistic punisher (OP). CP commits
to punishment but cooperates regardless of the information on commitments.
OP commits to punishment but chooses whether to cooperate depending on this
information. If there are few commitments, she might think that D would be
the better choice, to free-ride on C or to avoid being exploited by D. We
assume that an individual punisher acts cooperatively with probability q,
which relies on the number of punishers G_P. q(·) is given by



where δ is the responsiveness of the punisher. Appendix B describes the
payoff functions among C, D, L and P for the OP case. For appropriate
parameters, as discussed below, the opportunistic punisher produces three
different evolutionary dynamics among the four types, depending on δ. For
the lower range of δ, the opportunistic punisher does not make cooperation
evolve in a team. For the mid range of δ, the opportunistic punisher makes
almost universal cooperation possible when all four types can evolve.
Finally, for the upper range of δ, P can eliminate D without the help of L.
For later discussion, the five types of players that appear in this paper
are summarized in Table 2 by their actions and punishing behavior.



Name    Action                     Punishment

C       cooperation                No
D       defection                  No
L       exit                       No
CP a    cooperation                Yes
OP b    conditional cooperation    Yes

a Cooperative punisher, who cooperates unconditionally and punishes
  defectors in her team.
b Opportunistic punisher, who cooperates depending on the level of
  punishing commitment.

Table 2: Five types of players



Stochastic dynamics of CP case 

The main results of Hauert et al. (2007) are reproduced in Figure 1. First,
with voluntary participation, PGG cycles through Cooperator (C), Defector
(D) and Loner (L). The existence of L can perturb universal defection and
create an evolutionary cycle among the three types, which can also be
observed under replicator dynamics (Hauert et al., 2002). 4 Even though
voluntary participation breaks universal defection in PGG, the average
payoff a player can get cannot exceed that of L. That is, volunteering
itself does not enhance the fitness of team members in equilibrium. 5



As shown in Appendix C, assuming an infinitely large population, the C-P
equilibrium can be stabilized in some area of the S3 simplex. But this
result only shows that cooperation can be defended when a sufficient number
of punishers already exists. In the stochastic dynamic setting, punishment
alone cannot police D, since the fitness of the punisher cannot be higher
than that of D (panel (b) of Figure 1). This can be called the "dilemma of
punishment": P can regulate defective behavior in a group, but the cost of
punishment decreases the fitness of P. Eventually, the unique
stochastically stable homogeneous state is D, because P is always worse off
than D in the homogeneous state of D. In sum, neither L nor P alone makes
any significant contribution to the evolution of cooperation in PGG.

Interesting dynamics arise when all four types of players are involved in
PGG. Panel (c) of Figure 1 shows the C-D-L-P interaction. D cannot fixate P
because of the existence of L. L, however, tends to be conquered by C and
P. The movement between C and P is random drift, or neutral selection,
where all individuals have the same fitness. In this case, any random walk
in which the probability of moving to either side is identical for the
transient states leads to the same result. D can be regulated in a circular
stochastic relation among the four types, and L plays a pivotal role in
making a detour for the evolution of cooperation. Hauert et al. (2007)
named this mechanism "via freedom to coercion", which emphasizes the
synergistic enforcement between L and P in the process.
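The statement that neutral drift yields a fixation probability of 1/M can be verified with the standard closed form for a birth-death chain with two absorbing states. The sketch below is illustrative only; `fixation_probability` is a hypothetical helper name, and the constant ratios in the example are arbitrary.

```python
def fixation_probability(ratios):
    """Fixation probability of a single mutant in a birth-death chain.

    ratios[k - 1] is T_k^- / T_k^+, the ratio of the probabilities of losing
    and gaining one mutant when k mutants are present (k = 1, ..., M - 1).
    """
    total, product = 1.0, 1.0
    for ratio in ratios:
        product *= ratio   # running product of the first j ratios
        total += product   # 1 + sum over j of these products
    return 1.0 / total


if __name__ == "__main__":
    M = 100
    neutral = fixation_probability([1.0] * (M - 1))    # equal fitness: random drift
    favoured = fixation_probability([0.9] * (M - 1))   # mutant gains more often than it loses
    hindered = fixation_probability([1.1] * (M - 1))   # mutant loses more often than it gains
    print(neutral, 1.0 / M)
    print(favoured > 1.0 / M, hindered < 1.0 / M)
```

When every ratio equals one, the sum has M equal terms, which gives exactly 1/M; this is the benchmark against which the arrows in the fixation diagrams are classified.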

The role of L has a meaningful economic interpretation, in which L can be
regarded as the alternative provided by the market outside organizations
based on human cooperation. Namely, if the group of PGG is considered a
team or a firm, L represents the market. This issue of 'Organization vs
Market' was treated by Alchian and Demsetz (1972). Their conclusion is that
monitoring provided by an incentive-compatible residual claimant can
preserve the comparative advantage of the organization over the market. The
evolutionary dynamics of PGG explore another possibility for regulating
team production without formal monitoring or hierarchy. When the market
provides attractive alternatives, defection in an organization can be
regulated in the absence of direct monitoring. In this sense, a prolific
market and a successful organization can co-evolve in our stochastic
setting.

4 Appendix C provides the technique and results for replicator dynamics.

5 Sasaki et al. (2007) show that when players can use mixed strategies of C
or D with volunteering, better fitness can be obtained for some parameters.

(a) C-D-L case   (b) C-D-P case   (c) C-D-L-P case

Figure 1: Stochastic dynamics of PGG in the CP case. Parameters are
s = 0.25, r = 3, M = 100, G = 5, β = 1, γ = 0.15, and σ = 1. The percentage
under each type is its relative staying frequency. An arrow from A to B
means that type A can fixate type B, and the fixation probability is given
next to the arrow. When a fixation probability is less than 1/M, it is a
rare event that one type fixates another. A dashed line means that the
fixation probability equals 1/M, which is the case of random drift. This
automata-style presentation helps to illustrate the stochastic dynamics of
PGG. (a) shows the rock-paper-scissors evolutionary cycle among C, D and L.
(b) is the case in which costly punishment cannot survive under C-D-P
interaction. Finally, (c) is the evolution of cooperation with four types,
where the P-C random drift accounts for around 87%.

Naturally, the evolution of cooperation in PGG with four types depends
critically on the underlying parameters. Low β activates fixation from P to
D (P → D fixation). Figure 2 illustrates the effects of β and σ. For both
parameters, the evolution of cooperation is destroyed as their values
decrease. Low β creates P → D fixation, which allows D to absorb both C and
P. As σ decreases, the flow D → L also slows down, which increases the
staying frequency at the D-state. Panel (c) of Figure 2 also shows that
lowering γ does not change the frequency of the cooperative states, that
is, staying at C or P.

(a) low β   (b) low σ   (c) low β and low σ

Figure 2: Stochastic dynamics of PGG in the CP case when parameters are
unfavorable for CP. Common parameters are s = 0.25, r = 3, M = 100, G = 5.
Parameters for (a) are β = 0.15, γ = 0.15, σ = 1; those for (b) are β = 1,
γ = 0.15, σ = 0.1; those for (c) are β = 0.15, γ = 0.07, σ = 0.1. Low β
creates P → D fixation, which decreases the frequency of P. Low σ makes
D → L fixation slow, and the frequency of D increases. The related
calculations were carried out with an algorithm written in Mathematica
version 7 (Wolfram).

Stochastic dynamics of OP case 

As stated, in the OP case players first declare their commitment to punish
and then choose actions. Assuming independence in choosing actions, the
expected number of cooperating P in a team, where G_P equals the number of
commitments, is simply q·G_P. Payoffs are modified for the opportunistic
punisher, and the stochastic dynamics of the OP case are given in Figure 3.

Numerical examples imply that OP can contribute to the evolution of
cooperation where CP loses her power, as long as δ, the responsiveness of
the punisher, is sufficiently high. They imply that the efficacy of OP
comes from exploiting C and fighting D more successfully. When β and γ are
sufficiently high, OP cannot contribute anymore, because the commitment to
punish hurts herself to a serious level. Thus, when punishment is more
effective than a certain level, CP is more effective than OP in fostering
the evolution of cooperation in a PGG team.

(c) high β and low σ   (d) low β, low γ and low σ

Figure 3: Stochastic dynamics of PGG when P is opportunistic. In the
one-shot PGG, players commit to punishment and then choose actions. Common
parameters are s = 0.25, r = 3, M = 100, G = 5, δ = 0.8. Parameters for (a)
are β = 1, γ = 0.15, σ = 1; those for (b) are β = 0.15, γ = 0.15, σ = 1;
those for (c) are β = 1, γ = 0.15, σ = 0.1; those for (d) are β = 0.15,
γ = 0.07, σ = 0.1. As the figure shows, P → C fixation is key to the
evolution of cooperation. This dynamic is created by the opportunism of P,
which decreases the frequency of C and prevents exploitation by D.

Figure 4: The evolution of cooperation by OP in C-D-P and C-D-L-P
interaction. Parameters are equal to those in (d) of Figure 3. The Y-axis
is the sum of the frequencies of C and P. For δ < 0.88, only the four types
together, C-D-L-P, can make cooperation evolve. As δ grows, the evolution
of cooperation can be achieved by C-D-P interaction alone. In this case, L
as a depressor of D is unnecessary.

Figure 4 shows an interesting dynamic of the OP case. As δ approaches 1,
for some suitable parameters, the help of the loner becomes redundant. The
C-D-OP dynamics produce a 100% state of cooperation at P. When δ is
sufficiently high, P's selfishness alone brings about the evolution of
cooperation.



Let us compare the stochastic dynamics of PGG to replicator dynamics.
Appendix C illustrates the replicator dynamics of PGG. For C-D-OP
interaction, the equilibria under replicator dynamics agree well with the
equilibrium under stochastic dynamics. When parameters are suitable, C-D-OP
interaction under replicator dynamics produces two types of Nash
equilibrium (NE). In this case, when β is low (but not too low), γ is low,
and δ is high, a C-OP mixture with a high density of OP is an NE. This is
an equilibrium state in which opportunism by P is only sparsely observed
because of the high frequency of P. As discussed, stochastic dynamics
select this almost fully cooperative state by OP over the D-state when
parameters are suitable.

Figure 5: The evolution of cooperation when P plays C with error or tremble
rate δ. All parameters are equal to those of Figure 4. When δ = 0 (1), P
always plays D (C). The results imply that simple error cannot increase
cooperation in a team compared to CP. This shows the strategic advantage of
OP, who plays a more sophisticated strategy than simple types of error.

Now we consider how the opportunism of P helps the evolution of
cooperation. Let us compare the behavior of OP to a simple tremble or error
in playing actions. As Figure 5 shows, a simple tremble does not help to
overcome invasion by D or to fixate C, which is the key part that OP plays.
This implies that the opportunistic punisher has a more sophisticated
strategic reaction than simple types of error.

Opportunism makes P play in a correlated way according to the composition
of her team, as revealed by the level of commitment. For example, when a
team consists entirely of C or D except for one P, OP hardly ever plays C,
because the commitment level is low and punishment from others is not
expected. Thus, in the C-only team case, the payoff of OP is higher than
that of CP. In the D-only team case, the payoff of OP is less than that of
D but higher than that of CP. In the opposite instance, if a team consists
entirely of P, OP always plays C, because playing D hurts her through ample
punishment. In this case OP resembles CP. That is, assuming a suitable size
of G and a suitable set of parameters β, γ and δ, the payoffs of OP in the
three pure states are given by



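\[
\begin{aligned}
\pi_{OP} \left(= \tfrac{G-1}{G}\, r\right) &> \pi_{CP} \left(= \pi_C = r - 1\right) && \text{for the all-C case,} \\
\pi_D \left(= -\beta\right) > \pi_{OP} \left(= -(G-1)\gamma\right) &> \pi_{CP} \left(= -1 - (G-1)\gamma\right) && \text{for the all-D case,} \\
\pi_{OP} &\approx \pi_{CP} \left(= r - 1\right) && \text{for the all-P case,}
\end{aligned}
\]

where π_k is the payoff of type k.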
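The payoff ranking in the three pure states can be checked for concrete parameters. The sketch below assumes the reading π_OP = (G − 1)r/G versus π_CP = r − 1 in the all-C case, and π_D = −β, π_OP = −(G − 1)γ, π_CP = −1 − (G − 1)γ in the all-D case; it uses the parameter values reported for panel (d) of Figure 3. The function name and the dictionary layout are illustrative choices.

```python
def pure_state_payoffs(G: int, r: float, beta: float, gamma: float):
    """Payoffs of a single punisher (or D) in the three pure team compositions."""
    all_c = {"OP": (G - 1) / G * r,           # OP defects and free-rides on G - 1 cooperators
             "CP": r - 1.0}                   # CP cooperates like everyone else
    all_d = {"D": -beta,                      # a defector fined by the single punisher
             "OP": -(G - 1) * gamma,          # OP defects but pays for G - 1 fines
             "CP": -1.0 - (G - 1) * gamma}    # CP additionally loses her contribution
    all_p = {"OP": r - 1.0, "CP": r - 1.0}    # OP (approximately) behaves like CP
    return all_c, all_d, all_p


if __name__ == "__main__":
    all_c, all_d, all_p = pure_state_payoffs(G=5, r=3.0, beta=0.15, gamma=0.07)
    print(all_c["OP"] > all_c["CP"])
    print(all_d["D"] > all_d["OP"] > all_d["CP"])
    print(all_p["OP"] == all_p["CP"])
```

The all-D ordering depends on β being small relative to (G − 1)γ, which matches the low-β cases in which OP is said to matter.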

This strategic flexibility comes from the nonlinearity created by the
probabilistic reaction modeled by q(G_P). When the degree of nonlinearity,
δ, is sufficiently high, OP can copy the better reaction between C and D in
a correlated way. This flexibility creates C → P fixation and ends P → D
fixation.



4 Calculating Fixation Probabilities by Fermi Function



As Appendix B shows, the fixation probabilities of the Moran process can be
defined only within a certain range of s, the intensity of selection. To
generalize our model, we introduce pairwise comparison by the Fermi
function (Traulsen et al., 2006; Altrock and Traulsen, 2009). The Fermi
function defines the dynamics in terms of the payoff difference between two
types for any s. 6 When a transition from A-type to B-type occurs, the
probability is assumed to be

\[
p = \frac{1}{1 + e^{s(\pi_A - \pi_B)}},
\]

which is called the Fermi function. This makes the ratio of the
probabilities of losing and gaining one A-type player

\[
\frac{T^-}{T^+} = e^{-s(\pi_A - \pi_B)}.
\]

6 If we simply replace the fitness function of the Moran process,
(1 − s) + sπ, with e^{(1−s)+sπ} in order to consider any s ∈ [0, 1], the
resulting expression for the fixation probability is identical.

(a) CP case   (b) OP case

Figure 6: Key fixation diagrams of CP and OP. We assume suitable parameters
for unrelated variables. When β is larger than a critical level, D can
fixate CP and the evolution of cooperation is made. When CP is less
effective, OP can absorb C as long as δ is higher than a certain level.
This prevents D from exploiting C, and cooperation can evolve through the
players' opportunistic behavior.
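The pairwise comparison rule used in this section can be sketched in a few lines. The code assumes the form 1 / (1 + exp(s(π_A − π_B))) for the probability that an A-type player adopts type B; the payoffs in the example are hypothetical values.

```python
import math


def fermi(pi_a: float, pi_b: float, s: float) -> float:
    """Probability that an A-type player switches to B under pairwise comparison."""
    return 1.0 / (1.0 + math.exp(s * (pi_a - pi_b)))


if __name__ == "__main__":
    s = 0.25
    pi_a, pi_b = 2.0, 2.4            # hypothetical payoffs of types A and B
    a_to_b = fermi(pi_a, pi_b, s)    # A imitates the better-off B
    b_to_a = fermi(pi_b, pi_a, s)    # B imitates the worse-off A
    print(a_to_b, b_to_a)
    # The ratio of the two switching probabilities is exp(-s * (pi_a - pi_b)),
    # up to floating-point rounding.
    print(a_to_b / b_to_a, math.exp(-s * (pi_a - pi_b)))
```

Unlike the linear fitness (1 − s) + sπ, this rule yields valid probabilities for any s and any payoff difference, which is the reason for adopting it here.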

The evolution of cooperation can be analyzed by investigating the fixation
probabilities among the four types. Specifically, as implied by the
numerical examples in Section 3, the evolution of cooperation may depend on
the fixation between P and D and that between P and C.

We apply three approximations to obtain analytic expressions. 1) As
1/M → 0 can be assumed for sufficiently high M, the related payoffs can be
linearized around 1/M ≈ 0 as many times as necessary. 2) The approximated
fixation probabilities are divided into two categories: those that are
surely larger than 1/M, and those that cannot exceed it. We take the first
kind as legitimate and set the second to 0. 3) The approximation
$\sum_{k=1} \cdots \approx \int_{1} \cdots \, dk$ can be used because the
error between the two is $O((1/M)^2)$, which is plausible for a fairly
large M.

Let ρ_ij denote this simplified fixation probability of a single i-type in
a population that consists entirely of j-type. The fixation probabilities
common to both types of punisher are given by

\[
\begin{aligned}
\rho_{DC} &= s\left(1 - \frac{r}{G}\right), \\
\rho_{LD} &= s\sigma, \\
\rho_{CL} &= \frac{2\sqrt{2s(G-1)(r-\sigma-1)}}{2\sqrt{\pi} + 2\sqrt{2s(G-1)(r-\sigma-1)}}.
\end{aligned}
\]

First, these are invariant to the type of punisher. A high intensity of
selection increases ρ_DC and ρ_LD. When σ increases (decreases), the flow
D → L speeds up (slows down), but the flow L → C slows down (speeds up). As
discussed in Section 3, when σ gets smaller, the frequency of L decreases,
and that of C increases as a consequence. Without P, this flow ends up with
a higher frequency of D due to the lower ρ_LD.
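The quality of the linear approximations can be gauged against the closed form that the pairwise comparison process gives when the payoff lead of the mutant is constant. The sketch below rests on two assumptions that are not spelled out in the text: that the linearized probabilities read s(1 − r/G) for a single D among C and sσ for a single L among D, and that the payoff difference does not depend on the number of mutants. Parameter values follow Figure 1.

```python
import math


def fixation_constant_gap(delta_pi: float, s: float, M: int) -> float:
    """Fixation probability of one mutant whose payoff lead delta_pi is constant,
    so that every transition ratio equals exp(-s * delta_pi)."""
    x = s * delta_pi
    if abs(x) < 1e-12:
        return 1.0 / M  # neutral drift
    return (1.0 - math.exp(-x)) / (1.0 - math.exp(-M * x))


if __name__ == "__main__":
    s, r, G, sigma, M = 0.25, 3.0, 5, 1.0, 100
    cases = {
        "D in C": 1.0 - r / G,   # gain from withholding the contribution
        "L in D": sigma,         # loner's payoff against zero in an all-D team
    }
    for label, gap in cases.items():
        linear = s * gap
        closed = fixation_constant_gap(gap, s, M)
        print(label, round(linear, 4), round(closed, 4), closed > 1.0 / M)
```

For small s·Δπ and large M the closed form reduces to s·Δπ, which is why the linear expressions are used only for fixation probabilities that clearly exceed 1/M.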

Next, Figure 6 shows the P → D and P → C fixation diagrams for each type of
punisher. When β < β_CP, where β_CP = (G − r)/(G(G − 1)), the fixation
probabilities are respectively given by

\[
\begin{cases}
\rho_{DP} = s\left[1 - \frac{r}{G} - (G-1)\beta\right] & \text{for CP,} \\
\rho_{DP} = 0 & \text{for OP.}
\end{cases}
\]

Panel (a) of Figure 6 shows when CP is effective. When β < β_CP, D can
fixate P, and the evolution of cooperation is hindered, as the numerical
examples show. In this instance, OP, who plays opportunistically, can make
cooperation evolve in a team. C → P fixation is the key mechanism, which
weakens C → D fixation. Unlike in the CP case, where P and C have equal
fitness, OP has higher fitness than C, because OP tends to play D more
often the more C there are. As the frequency of C decreases due to C → P
fixation, the relative staying time at D decreases as well. Numerical
examples show that ρ_PC for OP changes discontinuously at a critical level
of δ, δ_OP. This implies that players' responsiveness to the information
can play a pivotal role in fostering the evolution of cooperation.

The analytic approach to this intuition can be carried out with the Fermi
function. For CP, ρ_PC is given by 1/M. For OP, when β and γ are
sufficiently low, ρ_PC takes different forms for δ < δ_OP and for
δ > δ_OP:

\[
\rho_{PC}^{OP} =
\begin{cases}
\cdots & \text{for } \delta < \delta_{OP}, \\
\cdots & \text{for } \delta > \delta_{OP},
\end{cases}
\qquad \text{where} \qquad
\delta_{OP} = \frac{G\left(G^2(\beta+\gamma) - G(\beta+\gamma+1) + 2r + 1\right)}{(2G-1)r + 2(G-1)G(\beta+\gamma)}.
\]

With the Fermi function, the fixation probability ρ_PC for OP changes
discontinuously, as the numerical examples do.
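The critical responsiveness can be evaluated for the parameters of panel (d) of Figure 3 and compared with the switch point near δ = 0.88 reported for Figure 4. The sketch assumes that the expression for the critical level reads δ_OP = G(G²(β + γ) − G(β + γ + 1) + 2r + 1) / ((2G − 1)r + 2(G − 1)G(β + γ)); this reading of the typeset formula is an assumption, and the function name is illustrative.

```python
def critical_responsiveness(G: int, r: float, beta: float, gamma: float) -> float:
    """Critical responsiveness delta_OP under the assumed reading of the formula."""
    cost = beta + gamma
    numerator = G * (G ** 2 * cost - G * (cost + 1.0) + 2.0 * r + 1.0)
    denominator = (2 * G - 1) * r + 2.0 * (G - 1) * G * cost
    return numerator / denominator


if __name__ == "__main__":
    # Parameters of panel (d) of Figure 3: G = 5, r = 3, beta = 0.15, gamma = 0.07.
    delta_op = critical_responsiveness(G=5, r=3.0, beta=0.15, gamma=0.07)
    print(round(delta_op, 3))
```

Under this reading the analytic threshold lies close to the numerically observed value, which is consistent with the claim that the analytic and numerical results agree.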



5 Concluding Remarks 

For game situations in which the Nash equilibrium predicts general
defection, the possibility of cooperation is one of the most challenging
and crucial questions in evolutionary economics and biology. This paper, in
a stochastic dynamic setting, discusses an intriguing and paradoxical path
to cooperation via players' opportunistic behavior. Unlike Hauert et al.
(2007), who emphasize the role of quitting in supporting P, who can
regulate D, we reverse the sequence of PGG and propose that the
opportunistic behavior of P may paradoxically make cooperation evolve in a
team. Moreover, in cases where the altruistic punisher cannot help the
evolution of cooperation, our opportunistic punisher can. This comes from
the dual role of opportunism: OP can end P → D fixation and create C → P
fixation. Both fixation flows decrease the relative staying frequency at
the D-state, which encourages the evolution of cooperation.
Finally, two items for future research should be mentioned. First, in this paper
we regard $\delta$ as a parameter that determines the responsiveness of probabilistic
opportunism. Even though simplicity justifies this, more interesting results
and questions could be discussed if $\delta$ were determined endogenously. Second,
another simplification is that the commitment to punish is always credible.
In the real world, some contracts are made in this fashion by depositing
money with a third party in case of non-fulfillment. However, partial
credibility of commitment may reveal more interesting and unexpected results
on the issues of this paper.

Appendix A The Stochastic Dynamics of 
Generalized Moran Process 

Moran process 

The Moran process is a classical population model that was developed in
population genetics and has recently been imported into game theory. In every
time step, an individual is chosen for reproduction with probability proportional
to its fitness, and produces a single clone that replaces a randomly selected other
member. The Moran process represents a simple birth-death process. Throughout the
process, the total population size, M, remains constant; i.e., the Moran process
ignores effects of population size. This assumption of an exogenous finite population
size can be considered an approximation to a model where environmental forces
keep the population from becoming infinite (Fudenberg et al. 2004).

For studying finite populations, it is convenient to transform fitness into
a convex combination of baseline fitness (generally assumed to be 1) and the payoff
obtained from interaction. That is, $f = (1 - s) \cdot 1 + s\pi$, where $f$ is the fitness of
a player, $\pi$ is the payoff from the game, and $s$ controls the intensity of selection.
When $s = 0$, selection is neutral and we have random drift. For $s \to 1$, fitness
can be equated to payoff. Since $f$ should be positive, there exists a maximum
$s$.
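As a minimal sketch of this fitness mapping, the following Python snippet computes $f = (1 - s) + s\pi$ with baseline fitness 1, together with the bound on $s$ below which every fitness stays positive. The function names `fitness` and `selection_limit` and the sample payoffs are illustrative assumptions, not part of the paper.

```python
def fitness(payoff, s):
    """Fitness as a convex combination of baseline fitness 1 and the game payoff."""
    return (1.0 - s) * 1.0 + s * payoff


def selection_limit(payoffs):
    """Least upper bound on s: every s strictly below it keeps all fitness positive.

    From 1 - s + s*pi > 0: if the smallest payoff is non-negative, any s < 1
    works; otherwise s must stay strictly below 1 / (1 - min payoff).
    """
    worst = min(payoffs)
    return 1.0 if worst >= 0.0 else 1.0 / (1.0 - worst)


if __name__ == "__main__":
    payoffs = [2.0, 0.5, -3.0]          # hypothetical payoffs, for illustration only
    s_max = selection_limit(payoffs)    # 1 / (1 - (-3)) = 0.25
    print(s_max, [fitness(p, 0.9 * s_max) for p in payoffs])
```

With a negative worst-case payoff the admissible range of $s$ shrinks below 1, which is why the intensity of selection cannot be chosen freely.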



Fixation probability 

Repeatedly applying Moran updating determines the evolutionary outcome for
residents and mutants. In the absence of mutations, which is in the spirit of the
literature on large deviations of long-run behavior, the Moran process ends up with
a homogeneous population of all residents or all invaders (Foster and Young
1990; Kandori et al. 1993; Young 1993; Kandori and Rob 1995). Regardless
of the initial state of the population, eventually all members of the population are
of one type. When this homogeneous state is realized, the conquering
type is said to reach fixation. Finding the fixation probabilities of the types in
the population is the key to these dynamics.

Let us explain how to find fixation probabilities in the two-strategy case. In a
population of size M, the number of A-strategy players is j, and the number of
B-strategy players is M − j. The probability of increasing the number of A from j to j + 1 is
denoted by $T_j^{+}$. Similarly, $T_j^{-}$ is the probability of decreasing j by 1. Since
there are two absorbing states under no-mutation game dynamics, two
fixation probabilities are given by

$$\phi_0 = 0 \quad \text{and} \quad \phi_M = 1,$$

where $\phi_j$ is the fixation probability when the number of A is j. For intermediate
states, the fixation probabilities are given by

$$\phi_j = T_j^{-}\phi_{j-1} + (1 - T_j^{-} - T_j^{+})\phi_j + T_j^{+}\phi_{j+1}, \tag{A.1}$$



which expresses the fixation probability in terms of a single time step backward
or forward. Rearranging (A.1) gives

$$0 = -T_j^{-}(\phi_j - \phi_{j-1}) + T_j^{+}(\phi_{j+1} - \phi_j). \tag{A.2}$$
(A.2) can be used to build a recursion for the differences between
fixation probabilities. For our discussion, $\phi_1$, the fixation probability of a single
A individual, is particularly important. By some algebra, this is calculated as

$$\phi_1 = \frac{1}{1 + \sum_{k=1}^{M-1} \prod_{j=1}^{k} \frac{T_j^{-}}{T_j^{+}}}. \tag{A.3}$$



It is possible to calculate the fixation probability $\phi_i$ for any initial state
with i individuals of type A (Nowak et al., 2004; Taylor et al., 2004). Only $\phi_1$ is needed
to investigate the stationary distribution with small mutations.
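Formula (A.3) translates directly into a short loop. The sketch below is an illustration rather than the authors' code: `fixation_probability` and `constant_fitness_rates` are hypothetical names, and the constant-fitness Moran rates are an assumed special case used only to exercise the formula.

```python
def fixation_probability(t_plus, t_minus, M):
    """phi_1 from (A.3): 1 / (1 + sum_{k=1}^{M-1} prod_{j=1}^{k} T_j^- / T_j^+).

    t_plus(j) and t_minus(j) return the one-step probabilities of moving
    from j to j + 1 and from j to j - 1 type-A individuals.
    """
    total = 1.0
    product = 1.0
    for k in range(1, M):
        product *= t_minus(k) / t_plus(k)   # extends the product up to j = k
        total += product
    return 1.0 / total


def constant_fitness_rates(f_a, f_b, M):
    """Moran rates when A and B have fixed fitness f_a and f_b (an assumption)."""
    def t_plus(j):
        return j * f_a / (j * f_a + (M - j) * f_b) * (M - j) / M

    def t_minus(j):
        return (M - j) * f_b / (j * f_a + (M - j) * f_b) * j / M

    return t_plus, t_minus


if __name__ == "__main__":
    M = 20
    neutral = fixation_probability(*constant_fitness_rates(1.0, 1.0, M), M)
    favoured = fixation_probability(*constant_fitness_rates(1.1, 1.0, M), M)
    print(neutral, 1.0 / M, favoured)
```

When both types have equal fitness every ratio in the product equals 1, so the sum has M terms and the result is 1/M; a mutant with higher fitness obtains a value above 1/M.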

For neutral selection, where drift is purely random, $T_j^{-} = T_j^{+}$ holds, hence
$\phi_1$ is simply 1/M. This fixation probability under random drift serves as a benchmark
for judging whether a single individual is strong enough to fixate in the whole population.
When the fixation probability of a single individual of a type is larger than
1/M, there is a statistical tendency for this type to take over the whole population.
Otherwise, this type is easily replaced by other types whose fixation
probabilities are larger than 1/M. This criterion for fixation offers a good
way to describe mutual invasion between two types, which is useful
for our purpose.⁷

⁷ When the fixation probability from A to B is smaller than 1/M, we can ignore this direction
of movement. This qualitative approach makes the analysis simpler and more illustrative, as the
following automata-style diagram shows.



Appendix B Payoffs of PGG

We denote the number of cooperators by c, defectors by d, loners by l, and
punishers by p. Naturally, M = c + d + l + p holds. Also, $0 < \sigma + 1 < r < G$
and G > 3 are assumed for relevant discussion. $\pi_{CD}$, the expected average
payoff of a focal C against D, is given by

$$\pi_{CD} = \sum_{k=0}^{G-1} \underbrace{\frac{\binom{c-1}{k}\binom{M-c}{G-k-1}}{\binom{M-1}{G-1}}}_{\text{(i)}} \; \underbrace{\left(\frac{k+1}{G}\,r - 1\right)}_{\text{(ii)}},$$

where (i) is the probability that the focal C's G − 1 co-players include k cooperators
and G − k − 1 defectors when a G-sized team is drawn from the M-population, and (ii) is
the payoff of C in such a team. Relevant payoffs for D and L are given by



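$$\pi_{DC} = \sum_{k=0}^{G-1} \frac{\binom{c}{k}\binom{M-c-1}{G-k-1}}{\binom{M-1}{G-1}} \cdot \frac{rk}{G}$$

[The remaining displays, for the payoffs involving L, are not recoverable from the source. The surviving fragments show only the loner payoff $\sigma$ and the all-cooperator payoff $(r - 1)$.]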

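Interactions of C, D and L with P depend on the type of
P. When the punisher is of the cooperative type, CP, the related payoffs are

$$\pi_{CP} = \pi_{PC} = r - 1,$$

$$\pi_{DP} = \sum_{k=0}^{G-1} \frac{\binom{d-1}{k}\binom{M-d}{G-k-1}}{\binom{M-1}{G-1}} \left[ \frac{G-k-1}{G}\,r - (G-k-1)\beta \right],$$

$$\pi_{PD} = \sum_{k=0}^{G-1} \frac{\binom{p-1}{k}\binom{M-p}{G-k-1}}{\binom{M-1}{G-1}} \left[ \frac{k+1}{G}\,r - 1 - (G-k-1)\gamma \right],$$

$$\pi_{LP} = \sigma, \qquad \pi_{PL} = \frac{\binom{M-p}{G-1}}{\binom{M-1}{G-1}}\,\sigma + \sum_{k=1}^{G-1} \frac{\binom{p-1}{k}\binom{M-p}{G-k-1}}{\binom{M-1}{G-1}}\,(r - 1).$$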
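The finite-population payoffs in this appendix are averages over random team compositions. The sketch below illustrates that idea under stated assumptions: the focal player's G − 1 co-players are drawn without replacement from the other M − 1 individuals, a cooperator in a team with k other cooperators earns r(k+1)/G − 1, each CP fines a defector by $\beta$, and each CP pays $\gamma$ per defector punished. The function names are hypothetical, and the code is one reading of the model, not the authors' implementation.

```python
from math import comb


def team_weight(n_focal_type, M, G, k):
    """Probability that exactly k of the focal player's G - 1 co-players share
    its type, when co-players are drawn without replacement from the other
    M - 1 individuals (n_focal_type counts the focal player as well)."""
    return (comb(n_focal_type - 1, k) * comb(M - n_focal_type, G - 1 - k)
            / comb(M - 1, G - 1))


def payoff_c_against_d(c, M, G, r):
    """Focal C with k other C: share r(k+1)/G of the pool minus own cost 1."""
    return sum(team_weight(c, M, G, k) * (r * (k + 1) / G - 1)
               for k in range(G))


def payoff_d_against_cp(d, M, G, r, beta):
    """Focal D with k other D and G-k-1 cooperating punishers, each fining beta."""
    return sum(team_weight(d, M, G, k) * (G - k - 1) * (r / G - beta)
               for k in range(G))


def payoff_cp_against_d(p, M, G, r, gamma):
    """Focal CP with k other CP, paying gamma for each of the G-k-1 defectors."""
    return sum(team_weight(p, M, G, k)
               * (r * (k + 1) / G - 1 - (G - k - 1) * gamma)
               for k in range(G))


if __name__ == "__main__":
    M, G, r = 20, 5, 3.0
    print(payoff_c_against_d(10, M, G, r))
    print(payoff_d_against_cp(1, M, G, r, beta=0.5))
    print(payoff_cp_against_d(19, M, G, r, gamma=0.3))
```

The weights are hypergeometric probabilities, so they sum to 1 over k = 0, ..., G − 1; in a homogeneous population of cooperators or punishers the average payoff reduces to r − 1.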

The payoffs for OP are

[The display equations for the OP payoffs are not recoverable from the source. The surviving fragments show the same hypergeometric weights as in the CP case, with upper indices $c - 1$, $p - 1$ and $d - 1$; team payoffs in which the contribution of each OP is scaled by $\delta$; punishment terms in $\beta$, $\gamma$ and $(\beta + \gamma)$; and a final loner expression ending in $(r - 1)$.]



For the Moran process, the transition probability for one forward step is given by

$$T_{ij}^{+} = \underbrace{\frac{m_i[(1-s) + s\pi_{ij}]}{m_i[(1-s) + s\pi_{ij}] + (M - m_i)[(1-s) + s\pi_{ji}]}}_{\text{probability of type } i\text{'s reproduction}} \cdot \underbrace{\frac{M - m_i}{M}}_{\text{probability of type } j\text{'s death}},$$

where $m_i$ is the number of type $i$, and $i, j \in \{C, D, L, P\}$. The probability for
one backward step is given by

$$T_{ij}^{-} = \frac{(M - m_i)[(1-s) + s\pi_{ji}]}{m_i[(1-s) + s\pi_{ij}] + (M - m_i)[(1-s) + s\pi_{ji}]} \cdot \frac{m_i}{M}.$$

The fixation probability of $i$ against $j$ is

$$\phi_{ij} = \frac{1}{\sum_{k=0}^{M-1} \prod_{m_i=1}^{k} \frac{T_{ij}^{-}}{T_{ij}^{+}}} = \frac{1}{\sum_{k=0}^{M-1} \prod_{m_i=1}^{k} \frac{1 - s + s\pi_{ji}}{1 - s + s\pi_{ij}}}.$$

As the fitness should be positive for $\phi_{ij}$ to be well defined, an upper limit on $s$ is given
by $1/(1 - \min \pi_{ij})$.

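Fixation probabilities are used to build a Markov transition matrix
between the four different homogeneous states, which is

$$\begin{pmatrix}
1 - \phi_{DC} - \phi_{LC} - \phi_{PC} & \phi_{CD} & \phi_{CL} & \phi_{CP} \\
\phi_{DC} & 1 - \phi_{CD} - \phi_{LD} - \phi_{PD} & \phi_{DL} & \phi_{DP} \\
\phi_{LC} & \phi_{LD} & 1 - \phi_{CL} - \phi_{DL} - \phi_{PL} & \phi_{LP} \\
\phi_{PC} & \phi_{PD} & \phi_{PL} & 1 - \phi_{CP} - \phi_{DP} - \phi_{LP}
\end{pmatrix}$$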
The above matrix defines entry and exit between the four homogeneous states.
For example, the first row describes how one mutant C influences the system. The
second element of this row is the probability that a single mutant C
conquers, or fixates in, the D-homogeneous state. Naturally, the sum of each column
should be 1. Fudenberg and Imhof (2006) show that the normalized eigenvector
for the largest eigenvalue, which is 1 in this case, determines the stationary distribution
for small mutations.
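The construction of the transition matrix and its stationary distribution can be sketched in a few lines. The code below is illustrative only: the names `transition_matrix` and `stationary_distribution` are assumptions, the fixation probabilities are made-up inputs, and power iteration is used here as a simple stand-in for an eigenvector routine. The dictionary key `(i, j)` holds the probability that a single mutant `i` fixates in a homogeneous `j` population.

```python
STATES = ("C", "D", "L", "P")


def transition_matrix(phi, states=STATES):
    """Column-stochastic matrix A with A[row i][col j] = phi[(i, j)] for i != j
    and each diagonal entry set so that its column sums to 1."""
    n = len(states)
    A = [[0.0] * n for _ in range(n)]
    for col, j in enumerate(states):
        for row, i in enumerate(states):
            if i != j:
                A[row][col] = phi[(i, j)]
        A[col][col] = 1.0 - sum(A[row][col] for row in range(n) if row != col)
    return A


def stationary_distribution(A, iters=2000):
    """Normalized eigenvector for eigenvalue 1, found by repeated x <- A x."""
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(iters):
        x = [sum(A[r][c] * x[c] for c in range(n)) for r in range(n)]
        total = sum(x)
        x = [v / total for v in x]
    return x


if __name__ == "__main__":
    # Hypothetical fixation probabilities: D is hard to invade, easy to reach.
    phi = {(i, j): 0.05 for i in STATES for j in STATES if i != j}
    for other in ("C", "L", "P"):
        phi[(other, "D")] = 0.01
        phi[("D", other)] = 0.2
    print(dict(zip(STATES, stationary_distribution(transition_matrix(phi)))))
```

In this made-up example most of the stationary mass sits on the D-state, because flows into D are large and flows out of D are small.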

Appendix C Replicator Dynamics and Stable NE 

Replicator dynamics for the PGG in this paper can be derived by formulating the
payoff for each type of player. For C-D-CP,

$$\pi_C = r(c + p) - 1$$
$$\pi_D = r(c + p) - \beta p (G - 1)$$
$$\pi_P = r(c + p) - 1 - \gamma d (G - 1)$$

where c, d and p denote the relative frequencies of C, D and P in an infinitely
large population, respectively. This system can be treated as a system of
linear differential equations, and the phase diagram and stable NE can easily be
given. The phase diagrams of Figure 7 show two types of equilibrium state. For
$\beta > 1/(G - 1)$, multiple stable NE are obtained, as in (a) of Figure 7. That is,
when punishment is sufficiently effective, a continuum of C-CP mixtures
can be supported as stable NE. For $\beta < 1/(G - 1)$, the D-state is the unique NE.
Without an exit option, stochastic dynamics selects the D-state as the unique equilibrium
for C-D-CP.
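The role of the threshold $\beta = 1/(G - 1)$ can be illustrated numerically. The following sketch integrates the standard replicator equation $\dot{x}_i = x_i(\pi_i - \bar{\pi})$ with a plain Euler step, using the three C-D-CP payoffs as stated in this appendix. The function names, step size, initial conditions and parameter values are assumptions chosen for illustration.

```python
def payoffs_cdcp(c, d, p, r, beta, gamma, G):
    """Payoffs of C, D and CP as functions of the population frequencies."""
    common = r * (c + p)
    return (common - 1.0,
            common - beta * p * (G - 1),
            common - 1.0 - gamma * d * (G - 1))


def replicator_step(x, pay, dt):
    """One Euler step of x_i' = x_i * (pi_i - average payoff)."""
    avg = sum(xi * pi for xi, pi in zip(x, pay))
    new = [xi + dt * xi * (pi - avg) for xi, pi in zip(x, pay)]
    total = sum(new)                      # guards against rounding drift
    return [v / total for v in new]


def simulate(x0, r, beta, gamma, G, steps=5000, dt=0.01):
    """Iterate the replicator step from the initial frequencies x0 = (c, d, p)."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, payoffs_cdcp(*x, r, beta, gamma, G), dt)
    return x


if __name__ == "__main__":
    G, r, gamma = 5, 3.0, 0.1
    # beta above 1/(G-1) = 0.25, starting with many punishers: D dies out.
    print(simulate((0.10, 0.05, 0.85), r, beta=1.0, gamma=gamma, G=G))
    # beta below 1/(G-1): D takes over regardless of the starting mix.
    print(simulate((0.30, 0.10, 0.60), r, beta=0.1, gamma=gamma, G=G))
```

On the C-CP edge the payoff advantage of a rare defector is $1 - \beta p (G - 1)$, which can be negative for some $p \le 1$ only if $\beta > 1/(G - 1)$; this is the condition that separates the two runs.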

Figure 7: Solid circles represent stable NE, and empty circles represent unstable fixed
points. (a) Multiple stable NE: C-CP coexistence can be a stable equilibrium for some
area of the $S_3$ simplex. (b) One stable NE: this evolution of cooperation disappears
when underlying conditions turn severe. Figures are generated by a modified
version of DYNAMO, originally written by William Sandholm, Emin Dokumaci,
and Francisco Franchetti (http://www.ssc.wisc.edu/~whs/dynamo/index.html).

For the C-D-L case, payoffs are given by

$$\pi_C = (1 - l^{G-1})(rc - 1) + l^{G-1}\sigma$$
$$\pi_D = (1 - l^{G-1})(rc) + l^{G-1}\sigma$$
$$\pi_L = \sigma,$$

where l denotes the relative frequency of L. Unlike the C-D-CP case,
C-D-L cannot be treated easily because L makes the system nonlinear.



Brandt et al. (2003) give a trick for formulating the replicator dynamics. It
makes use of the fact that the payoff difference between C and D depends only on
l. The three homogeneous states are natural fixed points. There are no other fixed
points on the boundary of the $S_3$ simplex. For r > 2, there is a unique rest point in the
interior of $S_3$, and the interior dynamics can be described by a Hamiltonian system. This is
equivalent to rock-paper-scissors dynamics, where the rest point is surrounded by
periodic orbits, as shown in Figure 8.
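The fact that the payoff difference between C and D depends only on l can be checked directly from the C-D-L payoffs as stated in this appendix: the terms in rc and in $\sigma$ cancel, leaving $\pi_C - \pi_D = -(1 - l^{G-1})$. The sketch below verifies this numerically; the function name `payoffs_cdl` and the parameter values are illustrative assumptions.

```python
def payoffs_cdl(c, d, l, r, sigma, G):
    """Payoffs of C, D and L; l**(G-1) is the chance that all co-players are loners."""
    alone = l ** (G - 1)
    pi_c = (1.0 - alone) * (r * c - 1.0) + alone * sigma
    pi_d = (1.0 - alone) * (r * c) + alone * sigma
    return pi_c, pi_d, sigma


if __name__ == "__main__":
    r, sigma, G, l = 3.0, 1.0, 5, 0.4
    # Same loner frequency l, different splits between C and D.
    for c in (0.1, 0.3, 0.5):
        pi_c, pi_d, _ = payoffs_cdl(c, 1.0 - l - c, l, r, sigma, G)
        print(c, pi_c - pi_d, -(1.0 - l ** (G - 1)))
```

Because the difference is independent of c and d, the relative growth of C against D is governed by the loner frequency alone, which is what makes the reformulation tractable.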
Figure 9: Panels (a) One stable NE and cycle; (b) One stable NE. (a) shows the case of two
stable NE, which are the D-state and C-OP coexistence; (b) shows the case of one
stable NE, the D-state.

Finally, payoffs for the OP case are given by

$$\pi_C = r(c + (\delta p)p) - 1$$
$$\pi_D = r(c + (\delta p)p) - \beta p (G - 1)$$
$$\pi_P = r(c + (\delta p)p) - \delta p - \gamma(d + (1 - \delta p)p)(G - 1) - \beta(1 - \delta p)p(G - 1)$$

By using a method similar to that of the C-D-L case, it can be checked that there
is no fixed point in the interior of the $S_3$ simplex, and that NE exist only at D and/or
on the C-OP boundary.⁸ Fig. 9 shows two typical dynamics for the C-D-OP case.
For $\gamma > 0$, the condition is given by

(a) for $\beta > \ldots$ and $\ldots < \delta < 1$ (the bounds are illegible in the source; the surviving fragment of the lower bound on $\delta$ contains the expression $-1 - \beta + G\beta - \gamma + G\gamma$),

(b) otherwise.

The cooperative equilibrium produced by OP in stochastic dynamics is case
(a) with low $\beta$, low $\gamma$ and high $\delta$. $p_{OP}$ in (a) of Figure 9 is given by $\frac{1}{(\beta + \gamma)(G - 1)}$.
High $\beta$ and $\gamma$ make $p_{OP}$ small, which means that the region for the evolutionary cycle is
enlarged. Otherwise, when the region for the evolutionary cycle shrinks, the population
consists mostly of C and OP, which can be regarded as a cooperative state.